SAHG, a comprehensive database of predicted structures of all human proteins
Abstract
Most proteins from higher organisms are multi-domain proteins and contain substantial numbers of intrinsically disordered (ID) regions. To analyse such protein sequences, taking human proteins as an example, we developed a dedicated protein-structure-prediction pipeline and accumulated its products in the Structure Atlas of Human Genome (SAHG) database at http://bird.cbrc.jp/sahg. In the pipeline, human proteins were examined by local alignment methods (BLAST, PSI-BLAST and Smith-Waterman profile-profile alignment), global-local alignment methods (FORTE) and prediction tools for ID regions (POODLE-S) and homology modeling (MODELLER). Conformational changes of protein models upon ligand binding were predicted by simultaneous modeling using templates of apo and holo forms. When no suitable templates for holo forms were available and the apo models were accurate, we prepared holo models using prediction methods for ligand binding (eF-seek) and conformational change (the elastic network model and linear response theory). Models are displayed as animated images. As of July 2010, SAHG contains 42,581 protein-domain models in approximately 24,900 unique human protein sequences from the RefSeq database. Annotation of models with functional information and links to other databases such as EzCatDB, InterPro or HPRD are also provided to facilitate understanding of protein structure-function relationships.
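The abstract names the elastic network model and linear response theory as the route from an apo model to a holo model. The sketch below illustrates that general idea only and is not the SAHG implementation: it builds an anisotropic network model Hessian from C-alpha-like coordinates and applies the linear-response relation, in which the displacement equals the pseudo-inverse of the Hessian applied to an external force (the thermal factor cancels). The function names, the 8 Å cutoff, the unit spring constant, the toy helix and the pulling force are all assumptions chosen for illustration.

```python
import numpy as np


def anm_hessian(coords, cutoff=8.0, gamma=1.0):
    """3N x 3N Hessian of an anisotropic network model (assumed cutoff and spring constant)."""
    n = len(coords)
    hess = np.zeros((3 * n, 3 * n))
    for i in range(n):
        for j in range(i + 1, n):
            d = coords[j] - coords[i]
            dist2 = float(d @ d)
            if dist2 > cutoff ** 2:
                continue  # no spring between distant nodes
            block = -gamma * np.outer(d, d) / dist2  # symmetric 3x3 block
            si, sj = slice(3 * i, 3 * i + 3), slice(3 * j, 3 * j + 3)
            hess[si, sj] = block
            hess[sj, si] = block
            hess[si, si] -= block
            hess[sj, sj] -= block
    return hess


def linear_response(coords, forces, cutoff=8.0, gamma=1.0):
    """Displacements predicted for external forces: dr = pinv(H) @ f."""
    hess = anm_hessian(coords, cutoff, gamma)
    vals, vecs = np.linalg.eigh(hess)
    keep = vals > 1e-8 * vals.max()  # drop rigid-body (zero) modes
    pinv = (vecs[:, keep] / vals[keep]) @ vecs[:, keep].T
    return (pinv @ forces.reshape(-1)).reshape(-1, 3)


# Toy input (hypothetical): a 10-node helix-like trace, not a real protein.
k = np.arange(10)
angle = np.deg2rad(100.0) * k
coords = np.column_stack([2.3 * np.cos(angle), 2.3 * np.sin(angle), 1.5 * k])

# Hypothetical perturbation: pull the two end nodes apart along their connecting line.
pull = coords[9] - coords[0]
pull /= np.linalg.norm(pull)
forces = np.zeros_like(coords)
forces[0] = -pull
forces[9] = pull

disp = linear_response(coords, forces)
print(np.round(disp, 3))
```

In a real setting the force would come from a predicted ligand-binding site rather than an arbitrary pair of nodes, and the predicted displacements would be added to the apo coordinates to obtain a candidate holo conformation.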
Similar resources
Accidental interaction between PDZ domains and diclofenac revealed by NMR-assisted virtual screening.
In silico approaches have become indispensable for drug discovery as well as drug repositioning and adverse effect prediction. We have developed the eF-seek program to predict protein-ligand interactions based on the surface structure of proteins using a clique search algorithm. We have also developed a special protein structure prediction pipeline and accumulated predicted 3D models in the Str...
Design and Implementation of a Comprehensive Database of the Written Heritage of Science and Technology
Purpose: This study aims to design and implement a comprehensive database of the written heritage of science and technology in the Regional Information Center for Science and Technology (RICeST) and determine the metadata elements required to describe the manuscripts. Method: This study was carried out by the content analysis method to identify the metadata elements needed to describe the coll...
A Study of the Tendency of Amino Acids to Neighbor One Another in Alpha Helices
In order to study the tendency of amino acid neighbors in helical structures, proteins with known structures were carefully analyzed. The studied helical positions: N , Ncap, N1, N2, N3, N4, M, C4, C3, C2, C1, Ccap, C and their doublet counterparts: N Ncap, NcapN1, N1N2, N2N3, N3N4, M1M2, M2M3, C4C3, C3C2, C2C1, C1Ccap, CcapC were carefully analyzed. The propensity for all amino acids i...
I-49: Human Y Chromosome Proteome Project
The success of the Human Genome Project (HGP) has provided a blueprint for the approximately 20,000 gene-encoded proteins potentially active in all of the hundreds of cell types that make up the human body. Yet we still have limited knowledge about a majority of the gene-encoded proteins which are the “building blocks of life” and “cellular machinery”. It is estimated that for nearly half of th...
Design of a Multi-epitope Peptide Vaccine against SARS-CoV-2 based on Immunoinformatics Data
Background and purpose: In 2019, the world witnessed the emergence of a virus that caused acute respiratory distress syndrome in humans with high mortality rates (approximately 3.7%). So far, no effective treatment has been proven against COVID-19. This study aimed at designing a multi-epitope vaccine combining several T-cell and B-cell epitopes of SARS-CoV-2. Materials and methods: Bas...